Chapter 1
Introduction 

1.1  The  Cognitive  Challenge 

For as long as human beings have existed on earth, we have attempted to decode and
understand our brains. The human brain is more advanced than that of any other species,
which gives us the capability to communicate and learn. The skill of communication also
gives us the freedom to learn from others’ experiences. Research in human cognition,
formerly limited to the fields of neuroscience, cognitive science, philosophy and
psychology, has recently been extended to artificial intelligence, where scientists
attempt to recreate what is not yet understood by our species.

In adults, almost 1 million motor neurons control our muscles [26], enabling an
enormous range of complex activities. The primary motor cortex is known to be
active during body movements. As shown in the somatotopic maps in Figure 1-1,
disproportionately large sections of the motor cortex and the somatosensory
cortex are responsible for representing the fingers and the hand. This results in our
capability for intricate movements and precise sensing with our fingers.

However, babies are born with only reflexive capabilities for manipulative movements.
A reflex is an involuntary, stereotyped response to a sensory input. For example,
babies curl their fingers when the palm is stimulated. This capability, in conjunction
with babies’ curiosity and visual feedback, is the basis from which they learn to
manipulate objects, and it results in the eventual large portion of cortex mapping.
Unfortunately, it is still unclear why and how these connections develop in our brain.
In neural and computer science, many learning strategies have been developed based
on our learning properties. However, they are full of assumptions and definitions that
are not necessarily valid in the real world, such as the Markov chain condition. As one
of the steps, our approach to this complex phenomenon is to reconstruct our behavior
and study the learning process using the faster-than-ever computational power of
computers, in order to provide insight into the functions of the human brain.

Figure 1-1: Human somatotopic mappings: left, motor cortex; right, somatosensory
cortex.


1.2  The  Physical  Challenge 

Sensory information is first detected by receptors, then routed and processed
within the nervous system before it reaches the brain. Mechanoreceptors, receptors
that respond to physical deformation, are responsible for touch and pain. The ridges
on the fingers orient cutaneous mechanoreceptors called Meissner corpuscles, which
are largely responsible for our ability to perform fine tactile discriminations with our
fingertips. The receptors monitor the environment and transduce the information,
which is then propagated toward the spinal cord. The spinal cord, being only about
42 cm long and 1 cm in diameter, receives all the motor and sensory inputs,
which are fed into multiple ascending sensory pathways and local reflex circuits.
Unfortunately, current connector and wire technology does not allow us to build
such a system, due to size and inorganic material limitations. Even when only one
hundred 30 gauge stranded wires are run through a small joint and repetitive strain
is applied, the wires are prone to breakage due to the inflexibility of conductive
materials.

The human body is adaptable to situations and tasks, which can be learned through
training using the same physical body parts. To date, most mechanical hands and
grippers constructed are task driven and limited to performing a very few specific
tasks. They may excel in their precision and strength for a particular task, but their
inability to perform non-specified tasks makes the existing hands nonhuman. The
human hand is an amazing device, capable of manipulating diverse objects and
performing diverse tasks, yet its precision and strength require more external muscular
assistance, feedback, and training than we imagine. The challenge is to build a system
that is not preconfigured, but is able to learn to accomplish many tasks like our hands.


1.3  Terminology 

Many parts of hands and fingers are discussed throughout this thesis. For simplicity,
the terminology is based on the human anatomy terminology shown in
Figure 1-2 [24].

The mechanical hand constructed for this thesis has three fingers, each having
two segments and two joints, and a thumb with one segment and one joint, so the terms
are altered as shown in Table 1.1.

Figure 1-2: Human anatomy terminology.


1.4  Organization  of  Thesis 

This  thesis  is  organized  into  4  additional  chapters  as  follows: 

Chapter 2 discusses the motivation for embodiment in this project. It introduces
the behavior of humans and the related previous research that leads to the current
humanoid research. It also argues the importance and difficulties of embodiment.
Embodiment is one of the best approaches for learning about human cognition,
but due to mechanical difficulties, many constraints must be considered.

Chapter 3 presents a detailed description of the hand built for this research. It
covers the mechanical design and implementation, including the structure of the physical
hand, the tendon cabling strategy, actuators, sensors and computing tools.




Area      Part                             Terminology
--------  -------------------------------  --------------
fingers   2nd digit                        index finger
          3rd digit                        middle finger
          4th digit                        ring finger
          segment further away from palm   Distal
          segment closer to palm           Proximal
          joint further away from palm     distal joint
          joint closer to palm             proximal joint
thumb     segment                          Proximal
          joint                            proximal joint
palm      inside                           palm
          outside                          dorsum
all       segments                         phalanges
          joints                           joints

Table 1.1: Terminology of mechanical hand parts used.


Chapter 4 has two parts. The first part describes the PID controller, which is used
locally to incorporate the primitive motion of the hand. The second part presents
the learning strategies, which are inspired by an infant’s learning process. Strategies
such as competitive learning, the back-propagation algorithm and reinforcement learning
are introduced and implemented. The experimental results are also shown in this
chapter.

Chapter 5 reviews the research discussed in this thesis and concludes with a
discussion of the future work.
Chapter 2
Embodiment 


This chapter presents the motivation for embodiment and illustrates its significance
to this thesis. Humans’ cognitive and physical behavior is discussed, with a focus on
infants’ manipulation behavior. The hand built is a self-contained, human scaled,
non-task driven tool that learns its own cognitive and physical behavior, differentiating
this research from previous work in manipulation tools. The advantages and disadvantages
associated with building such a system are considered.

2.1  Motivation  and  Related  Work 

2.1.1  Infants 

Piaget was one of the first of the modern psychologists to recognize the infant’s
manipulative exploratory behavior with the environment as a vehicle of cognitive
stimulation [22]. Infancy is not only a time when muscles and the nervous system
mature, but also a time of active and continuous learning which allows a baby to establish
effective transactions with the environment and move toward a greater degree of
autonomy. During this time, infants practice and perfect sensorimotor patterns that
become behavioral modules which will be seriated and embedded in more complex
actions.




Human motor control is a sequential process which is affected by the order of
development of different regions of the brain and the nervous system. Since the control
of the central body areas matures before that of the outer areas, hand development comes
later than that of other parts of the body. Consequently, arm motion controlled by a more
mature shoulder joint causes accidental collisions with objects in the environment as
infants come in contact with an increasing number and variety of objects. Reflex is
the only hand motor control present at birth. When the skin of the palm is touched
by an object, the muscles of the hand contract, curling the fingers,
whereas if a strong force is applied, the fingers open to alleviate pain by expanding
the muscles. The reflex is completely controlled and pre-programmed at the spinal
cord, and the summary of the reaction reaches the brain long after the action has been
taken. When this process repeats itself, the nervous system makes the connection
between the stimulus and its corresponding actions, resulting in the first step of
manipulation learning. Through touching objects, babies learn their shapes,
dimensions, slopes, edges, and textures. They also finger, grasp, push, and pull to
learn the material variables of heaviness, mass and rigidity, as well as the changes
in visual and auditory stimuli that objects provide. Visual feedback is a crucial
piece in manipulation learning, as seen in infants only a few days old extending their
hands toward a visible object [28]. This instinctive motivating information is triggered
somewhere in the nervous system and allows explorative learning to begin.

2.1.2  Mechanical  Hand 

Since the eighteenth century, the mechanics of hands have been studied and have been
the model for various mechanical constructions, primarily for prostheses and
telemanipulators, which are manipulators controlled remotely [21]. More recently, human
hands have been analyzed for industrial mechanical grippers, and many of them are used
reliably in assembly settings. They are built specifically for the environment in which the
grippers have to work, and they are so different for each application that a standard
industrial hand that satisfies every need cannot be built. Their functions are mostly
clamping, vacuum, and magnet, which are activated by pneumatic, hydraulic, electric
and mechanical force.

The first dexterous mechanical hand that resembled a human hand was the Utah/MIT
Dextrous Hand, built about 10 years ago [14]. The hand itself was approximately
anthropomorphic in size, including three tendon operated fingers and a thumb with
multichannel touch sensing capability. Each finger included three parallel axis joints
and a proximal joint, which were independently controlled using a tendon system
totalling eight tendons and actuators per finger. The 32 actuators were mounted in the
forearm for controlling the tendons, and a pneumatic approach was used due to its low
weight and compactness. Optical fibers and birefringent materials were used for the
touch sensing system. The control system simply delivered joint angle commands to
servo systems at each joint so that the hand assumed various desired configurations,
integrating touch sensors and tendons. This work was significant in that it
could be used for multiple purposes in research, giving the capability to integrate
additional systems such as learning algorithms or more sensors in an anthropomorphic
way.

2.1.3  Humanoid 

Attempts 

Originally, most humanoid robots were clever adaptations of existing industrial robots
or specialized mechanical arms. Later there were explicit attempts to make robots
anthropomorphic in appearance and capabilities. Wabot was exhibited at the
Japanese Expo in 1985, where it played a piano with precise and fast finger work [32].
It had a human appearance and, if examined briefly, it could visually fool people into
believing that it had a cognitive system. Though this robot design was inspired by the
human hand motor system, it was not practical in any sense of the word. It was bolted in
front of a piano, and the only capability it had was to play a piano. No other tasks,
even just manipulating objects, could have been done by the robot. While various
engineering enterprises have modeled their artifacts after humans to one degree or
another, nobody has seriously tried to couple human like cognitive processes to these
systems methodologically.

Figure 2-1: A picture of Cog.

Cog 

At the MIT Artificial Intelligence Laboratory, a research group headed by professors
Rodney A. Brooks and Lynn Andrea Stein is currently developing an integrated physical
humanoid robot named Cog [3], shown in Figure 2-1. This system will include
vision, sound input and output, and dextrous manipulation, all controlled by a
continuously operating parallel MIMD computer as the brain. The processors are 16 MHz
Motorola 68332s on standard boards, 16 of which plug into a backplane. The backplane
provides each processor with six communications ports and a peripheral processor
port. The system can connect up to 256 processors by stacking 16 backplanes
to a single front end processor. Each 68332 communicates with up to 16 Motorola 6811s,
which are single chip processors with onboard memory, timer, SPI, analog to digital
converter, and some I/O ports. The motor skills that are handled at the spinal level
for humans are processed by 6811 motor boards, which act like spinal cords. The goals of
this project are to build a prototype general purpose autonomous robot and to
understand human cognition. This is the first time anyone has attempted to construct
an embodied autonomous humanoid intelligent robot.
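
As a quick check of the processor counts quoted above, the following sketch multiplies out the stated limits. The ceiling on the number of 6811s is only an upper bound implied by those limits, not a figure given in the text, and the constant names are made up for this illustration.

```python
# Limits stated in the text; the constant names are illustrative only.
PROCESSORS_PER_BACKPLANE = 16   # 68332 boards plugged into one backplane
MAX_BACKPLANES = 16             # backplanes stacked on one front end processor
MAX_6811_PER_68332 = 16         # 6811s that each 68332 can communicate with

# Maximum number of 68332 processors in a fully populated system.
max_68332 = PROCESSORS_PER_BACKPLANE * MAX_BACKPLANES

# Implied upper bound on 6811 microcontrollers (an inference, not a stated figure).
max_6811 = max_68332 * MAX_6811_PER_68332

print(max_68332)  # 256, matching the figure quoted in the text
print(max_6811)   # 4096
```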

Currently we are at a primitive building and integrating stage in hardware and
software, including arms, hands, ears and eyes. As we put the pieces together, we
will be forced to understand the physical constraints, which can lead to a better
understanding of how we should build the pieces. When all the parts are integrated
to our one front end processor, we will be able to treat Cog as a whole to attack
problems that require coordinating the whole body. A simple operation, such as
picking up a bell, requires sound localization, torso control, visual feedback and arm
and hand manipulation skills. This kind of task may only be done at the cognitive
level using a system like the one we are building right now. Cognitively, this project is
important because studying the way Cog decides to execute certain actions may lead
to an understanding of our own cognition.
2.2  Embodiment  of  Hand

2.2.1  Overview

Why?

The importance of embodiment for understanding human cognition is a matter of
controversy in the artificial intelligence community. Many argue that a simulation of
such a system can satisfy the need, and would not waste the time needed to build a complex
hardware creature. However, we live in a noisy environment and we are capable of learning
to ignore irrelevant noise. For example, we can recognize a telephone even when the
edges are dirty or chipped, which cannot be easily done with current computer
vision technology. We are restricted not only by the limited technology that allows us to
build such a system, but also by what we know about human biological systems.

Another example that shows the importance of embodiment is the study of bird
wings. The physics of bird wings has been studied since the 16th century, with the aim
of embodying them at a human scale in pursuit of our dream to fly. With a solid
understanding of aerodynamics, a computer simulation can be built to better understand
the functionality of the wings. Even for simulating such a simple environment as air,
many assumptions, such as those about wind and pressure, need to be made in order for
the simulation to work consistently. While studying such a system can show important
points in the flying mechanism, the system still needs to be physically built to
understand other constraints that occur only in a real world setting.

The attempt to understand human functionality is much more complicated than
studying bird wings. Many assumptions, probably including some that are not valid
in the real environment, are necessary because we do not know enough about how
we process information that we receive from the environment. Therefore, it is even more
crucial to build such a system physically to understand its constraints and limitations.
As we build the system, still with many assumptions and using existing technology,
we may realize human functionalities that simulations have not been able to teach
us. By attempting to build an anthropomorphic hand, many physical limitations
and constraints are realized, and those realizations are requisite to unraveling the
questions of human physical and cognitive functions.

Figure 2-2: A picture of Cog’s hand.

Physical  Setting 

This project uses an anthropomorphic scale hand which has three fingers and an
opposing thumb, shown in Figure 2-2. Each finger has two coupled joints that are
controlled by a miniature steel cable. Due to the nature of the coupled cabling
strategy, the finger is compliant. There are four motors, one controlling each finger,
each generating a maximum torque equivalent to holding a 0.5 pound object at the tip of
a finger. The motors are integrated with rotational potentiometers to detect the motor
positions. Force/pressure sensors cover the surface of all fingers. A finger has two
phalanges, each of which has two force sensors. The thumb has two force sensors, and the
palm has four position sensors in addition to a force sensor. All the sensory readings are
multiplexed and converted to digital signals at a Motorola 6811 microcontroller, which
is integrated on the top of the dorsum. The 6811 has four analog-digital converter
ports and four pulse width modulator ports, which are connected to the motor drivers
for all four motors. The microcontroller acts like a spinal cord, containing a PID control
loop and handling reflexive reactions. A larger microcontroller, the 68332, is interfaced
for higher operations such as learning and coordinating with other features such as eyes
and ears.
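
To make the "spinal cord" role of the local microcontroller more concrete, here is a minimal sketch of a position loop with a reflex override. It is an illustration under stated assumptions, not the firmware used on the 6811: the gains, the force thresholds, the 8-bit position values, the signed PWM range and all function names are hypothetical. The reflex rule follows the description in Section 2.1.1 (a touch curls the fingers, a strong force opens them).

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller; the gains are hypothetical."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, target: float, measured: float, dt: float) -> float:
        """Return the control output for one sample period of length dt."""
        error = target - measured
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def reflex_target(force: int, current_target: int,
                  touch_threshold: int = 20, pain_threshold: int = 200,
                  open_position: int = 0, closed_position: int = 255) -> int:
    """Override the position target with a reflex (all values are made-up 8-bit numbers)."""
    if force >= pain_threshold:
        return open_position      # strong force: open the finger
    if force >= touch_threshold:
        return closed_position    # light touch: curl the finger
    return current_target         # no contact: keep the commanded target


def to_pwm(command: float, limit: int = 255) -> int:
    """Clamp a controller output to a signed PWM range (assumed to be +/-255)."""
    return max(-limit, min(limit, int(round(command))))


pid = PID(kp=2.0, ki=0.0, kd=0.0)
target = reflex_target(force=50, current_target=100)   # touch detected, so curl
pwm = to_pwm(pid.step(target, measured=90, dt=0.01))
print(target, pwm)
```

Keeping the reflex as a target override in front of the PID loop means the same loop serves both reflexive and commanded motion, which is one simple way to mirror the layering described in the text.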

2.2.2  Constraints 

Strength  and  Precision 

Many researchers have successfully created hands that are reasonably small and
strong, interfaced with large forearms in order to carry many high-powered motors,
precision encoders, and gears [2, 30, 31]. However, in creating a human scale model,
it is crucial to minimize the weight and the size of the hand. As a trade-off,
increasing the strength and the precision becomes complex. Minimizing wires and cables is
achieved by placing actuators close to the joints, and local processors close to all the
sensors and motors. Optimally, everything should be contained within the fingers and
the palm. In order to contain the motors in the hand, both the number and the size of
the motors need to be decreased significantly from those of all the existing mechanical
hands. To reduce the number, all the joints in a finger are coupled with a tendon cable
which is pulled from both directions, for curling and expanding, by a single motor. This
strategy limits the strength of the hand due to the small size of the motors, the
compliance of the cabling strategy, and the material of the cables. To avoid using large
encoders, rotational potentiometers are used at the expense of reducing the accuracy
from 16 bits to 8 bits of information. Needless to say, small parts are difficult to
construct, which increases the complexity of these constraints. However, when the human
hand mechanism is analyzed for infants, each finger has minimal torque, and it is
impossible to even estimate the angles of the joints without visual feedback or externally
applied force. Thus, studying infants’ learning skills only requires the strength of
industrially available miniature motors and the precision of potentiometers read through
8 bit analog to digital converters.
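
To give a feel for the 16 bit versus 8 bit trade-off, the sketch below computes the angular step represented by one count of an N-bit reading. The 100° span is an assumption borrowed from the proximal joint limits given in Chapter 3 (from -5° to 95°); the potentiometers actually measure motor position, so the real mapping to joint angle may differ.

```python
def degrees_per_count(span_deg: float, bits: int) -> float:
    """Angular step of one count when span_deg is spread over 2**bits levels."""
    return span_deg / (2 ** bits - 1)


# Assumed span: proximal joint limits of -5 and 95 degrees from Chapter 3.
SPAN_DEG = 95.0 - (-5.0)

for bits in (16, 8):
    step = degrees_per_count(SPAN_DEG, bits)
    print(f"{bits:2d} bits: {step:.4f} degrees per count")
```

Under this assumption an 8 bit reading resolves roughly 0.4° per count, which is coarse for a precision manipulator but consistent with the argument that infant-like learning does not need fine joint angle estimates.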


Stability  and  Orientation 

For multi-finger manipulators, stabilizing the grasp is a critical issue. According to
some investigations done in the past, a four finger manipulator can handle 99% of the
parts that a five finger manipulator or human hand can handle, a three finger manipulator
can handle 90%, and a two finger manipulator 40%. For the humanoid hand, a three finger
with a thumb configuration is used to reinforce the stability for manipulating objects of
various shapes [30]. For example, the last finger can be used as the base to hold a small
object. Young infants do not use the thumb as an opposing finger, and use all fingers like
a one degree of freedom compliant gripper. As learning proceeds, the opposing thumb
becomes the most important finger for manipulation, and the hand slowly increases its
degrees of freedom to more than twenty-five, though many are coupled by the nature of the
ligament structure and the location of tendon insertions. For our embodied hand, all four
fingers have a designated motor, which gives each one degree of freedom. However,
three fingers have two coupled joints, yielding a total of seven degrees of freedom
visually. From the construction of the hand, various objects can be manipulated
within the torque limit of the arm and the hand.

The orientation of the hand during reaching is an important part of a grasping
procedure. Babies’ initial reaches are awkward, but they learn to coordinate and turn
them into a smooth movement within a few months [34]. The initial reaction during reaching
is to orient the palm toward the desired point of contact, and preshape the fingers
according to the shape of the object. Unfortunately, without visual feedback or arm
movement coordinated with the hand, those procedures need to be ignored. For this
research, the orientation of the hand is fixed to have the palm perpendicular to the
ground for simplicity.
2.2.3  Sensorimotor  System

Meissner corpuscles are elongated encapsulated endings that are oriented with their
long axis perpendicular to the surface of the skin. They are quite numerous in the
skin of fingertips, and they are largely responsible for our ability to perform fine
tactile discriminations with our fingertips. Unfortunately, this system is still not
understood well enough to implement it in an inorganic form. Tactile sensing research is
an ongoing field, and no commercially available skin is yet good enough to be
interfaced to achieve human like precision. For creating a human like system, many
constraints need to be considered to find an optimal solution within our existing
technology. First, the skin needs to be flexible to adapt to the shape of the surface of
the fingers and palm. Second, the size and the number of wires need to be minimized
for creating a human scaled hand. The phalanges are hollowed to allow wires to run
through them, but the space is still very limited.

One of the main goals of this project is to learn from building a cognitive system
and to learn how such a system should be built. For this purpose, we can start off using
a tactile system that is not as accurate as human finger skin. If the cognitive system
we build tells us in the future that more precise tactile sensors play a crucial role in
learning, we will try to add such a system. Since the most important information
needed is the force information, followed by the position of contact, many force sensors
and several position sensors are used for the hand.


2.2.4  Learning 

Learning manipulation in an unpredictable, changing environment is a complex task.
It requires a nonlinear controller to respond in a nonlinear system that contains
a significant amount of sensory inputs and noise [23]. Investigating the human
manipulation learning system and implementing it in a physical system has not been done
due to its complexity and too many unknown parameters. Conventional adaptive
control theory assumes too many parameters that are constantly changing in a real
environment [33, 37]. For an embodied hand, even the simplest form of learning
process requires a more intelligent control network. Wiener [36] has proposed the idea of
“Connectionism”, which suggests that a muscle is controlled by affecting the gain of
the “efferent-nerve - muscle - kinesthetic-end-body - afferent nerve -
central-spinal-synapse - efferent-nerve” loop. Each system within the loop, such as the
efferent nerve, contains its own feedback loop system. This kind of loop is inherently
nonlinear, with the capability to take many noisy inputs, and may be implemented in a
physical hand. The kinds of learning strategies that can be used for an implementation
are still very limited, but for the individual systems, standard competitive learning and
backpropagation algorithms are used. To connect the whole system, a connectionist
implementation of reinforcement learning is used for the embodied hand.
Chapter  3


Hardware  Design 


This  chapter  presents  the  hardware  design  of  Cog’s  hand.  The  hand  is  made  of 
aluminum  and  designed  to  minimize  weight  and  size.  It  has  a  microprocessor  and 
sensor  interface  circuit  on  top  of  the  dorsum  and  has  36  total  sensors  on  the  surface 
and  joints. 

3.1  Mechanical  Design 

3.1.1  The  Structure  of  Hand 

The hand has a 4.0 inch x 4.0 inch palm with three fingers and an opposing thumb,
where the diameter of the fingers is 1.0 inch. To minimize the weight and allow space
to run cables and wires, each phalange is hollowed out using a lathe to a 0.02 inch
thickness. Joint design is done as in Figure 3-1 by setting,


$$\max(\theta) = 95^{\circ} \tag{3.1}$$

$$\max(\phi) = \ldots \tag{3.2}$$

where $\theta$ is the angle for the proximal joint and $\phi$ for the distal joint. There
are physical limits at the proximal joint and the distal joint, as shown in Figure 3-2, so
that when the fingers are fully open,

$$\min(\theta) = -5^{\circ} \tag{3.3}$$

$$\min(\phi) = -10^{\circ}. \tag{3.4}$$

Figure 3-1: A diagram of joint with a bending angle: frontal view (left), side
view (right).

Within a joint, there is a miniature steel pulley of diameter 0.5 inches and a shaft
that is fixed to the phalange above the joint (i.e., the pulley in the distal joint is
fixed to Distal), and friction is minimized using miniature ball bearings. Cables are run
in such a way that both curling and expanding are controlled using one continuous cable
and one motor, as shown in Figure 3-3. This cabling mechanism works because the
rotational force applied by a motor results in a tension in the cable that causes the
friction force of the pulleys to move the joints (Figure 3-4). The steps of applied and
induced force effects of this mechanism are illustrated using a finger curling example:

1 .  A4otor  applies  a  tension  to  the  inner  cable. 

2.  Friction2  becomes  strong  enough  to  rotate  the  pulley  in  the  proximal  joint. 
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Physical 
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Contact 
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Figure  3-2:  Physical  limits  for  a  proximal  joint(left)  and  a  distal  joint(right) 


Opening  Motion 


Closing  Motion 


Figure  3-3:  Cabling  configurations  of  curling  and  expanding  motion. 
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friction2  friction  1 


motor 


Figure  3-4:  Theory  of  cabling  mechanism  with  applied  and  induced  forces. 

3. Proximal comes in contact with an object or reaches a physical limit, causing
a resistive force.

4. The resistive force from step 3 overcomes friction2, causing the cable to slip over
the pulley in the proximal joint and applying tension to the cable in Proximal.

5. Friction1 is induced to rotate the pulley in the distal joint.

6. Distal comes in contact with an object or reaches a physical limit, causing a
resistive force.

7. The cable reaches its maximum tension and stops. This is an optimal grasping
configuration for this finger.
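
The curling sequence in the steps above can be illustrated with a small idealized sketch. It is not the thesis implementation: it assumes the cable does not slip while a joint is free to move, slips perfectly once that joint is blocked, and that cable travel converts to joint angle through the 0.5 inch pulley diameter given in the text. The function name and its arguments are hypothetical.

```python
import math

PULLEY_RADIUS_IN = 0.25  # half of the 0.5 inch pulley diameter given in the text


def curl_finger(cable_pull_in, proximal_stop_rad, distal_stop_rad):
    """Distribute cable travel over the two joints in the order of steps 1-7.

    Idealized assumption: the proximal joint takes all rotation until it is
    blocked (by an object or a physical limit), then the distal joint takes
    the remainder until it is blocked too.
    """
    total_angle = cable_pull_in / PULLEY_RADIUS_IN        # radians of pulley rotation
    proximal = min(total_angle, proximal_stop_rad)        # steps 1-3
    distal = min(total_angle - proximal, distal_stop_rad) # steps 4-6
    stalled = total_angle >= proximal_stop_rad + distal_stop_rad  # step 7
    return proximal, distal, stalled


if __name__ == "__main__":
    # Proximal blocked after 1.0 rad, distal after a further 0.5 rad.
    print(curl_finger(0.5, 1.0, 0.5))
```

With a small cable pull only the proximal joint moves; once it is blocked, the remaining travel appears at the distal joint, which is the coupling behaviour the single-cable design relies on.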

To achieve such a coupling effect for the joints, the tension of the cable and the
potential friction at the surface of the pulleys need to be considered in detail. If the
pulley potential friction is too high, the resistive force at step 3 cannot overcome the
friction and the distal joint cannot be controlled. When the cable tension is higher
than required as the finger curls, the proximal joint is controllable whereas the distal
joint cannot be moved. The proximal joint is still controllable because the tension of
the inner cable applied by the motor is larger than the force against it, due
to a minor slip of the outer cable that occurs within the proximal joint to alleviate the
tension between the motor and the proximal joint. Due to this effect, the tension of
the outer cable between the distal joint and the proximal joint increases and induces
a frictional force against the direction of the friction that causes the joint movement,
as shown in Figure 3-5. As a result, there is not enough friction to move the distal
joint. When the tension is too low, the compliance becomes so large that the
grasping force is weakened.

Figure 3-5: A free-body diagram for a distal joint pulley (label: friction out).

The optimal total cable length is calculated using the formula

$$L = 2\,(l_{Distal} + l_{Proximal} + s) + \dots \tag{3.5}$$

where $L$ is the total cable length, $l_m$ is the length of phalange $m$, $s$ is the length from
the distal joint to the cable terminal point, and $d$ is the diameter of the pulleys. $0.04L$
is added to achieve an optimal tension and compliance. The cable material,
nonstretching nylon-coated steel, is chosen for its durability, but it still
stretches over time. A tension cranker is designed as shown in Figure 3-6 so that the
tension can be adjusted back to an optimal strength when the cable has stretched.
The cable is terminated using cable locks within Distal as shown in Figure 3-7.

Figure 3-6: Tension cranker design for adjusting the stretched cable length.

Figure 3-7: Cable termination using cable locks (labels: cable, cable lock, pulley; dimension 0.1 in).

Figure 3-8: Diagram of hand used to determine the length of phalanges: side view (left)
and front view (right).

At the palm, the fingers are separated by 0.5 inches and the outer fingers are fixed
to the palm at 15 degrees away from the middle finger. Each finger has two phalanges,
and their lengths are chosen to avoid colliding with other fingers while still allowing the
tips of all fingers to meet at one point when they are fully closed, as shown in
Figure 3-8. Using this figure, the Distal and Proximal lengths are calculated using


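$$x = l_1 \cos\theta_1 + l_2 \cos(\theta_1 + \theta_2) + b \tag{3.6}$$

$$b = \frac{a + 0.5}{\tan 15^\circ} \tag{3.7}$$

$$a = \frac{0.5}{\sin 75^\circ} \tag{3.8}$$

$$x = \frac{0.75 + 0.75}{\tan 15^\circ} + c \tag{3.9}$$

$$c = 0.75 \sin 15^\circ. \tag{3.10}$$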
and, with the total finger length set to 4.5 inches, a set of linear equations can be
formulated:

$$l_1 + l_2 = 4.50 \tag{3.11}$$

$$0.087\,l_1 + 0.996\,l_2 = 1.999 \tag{3.12}$$

which gives the length of Distal to be 2.75 and the length of Proximal to be 1.75, and
the phalanges are built accordingly. The tip of a finger is made of polyethylene and
covered with vinyl. An opposing thumb has one degree of freedom, and its length is
chosen to meet the other finger tips for the purpose of fine manipulation. It is fixed
to the palm, and the proximal joint is controlled with a steel cable as in the other
joints. Because this joint is not coupled, the torque exerted for the thumb is larger
than for the other fingers.
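
The arithmetic behind the phalange lengths can be checked with a short sketch. It assumes the reading of Equations 3.7 to 3.10 in which b = (a + 0.5)/tan 15°, a = 0.5/sin 75°, x = (0.75 + 0.75)/tan 15° + c and c = 0.75 sin 15°, and it solves the 2x2 system of Equations 3.11 and 3.12 by Cramer's rule. The helper name `solve_2x2` is hypothetical.

```python
import math

# Geometry constants as read from Equations 3.7-3.10 (inches, degrees).
a = 0.5 / math.sin(math.radians(75))
b = (a + 0.5) / math.tan(math.radians(15))
c = 0.75 * math.sin(math.radians(15))
x = (0.75 + 0.75) / math.tan(math.radians(15)) + c


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve [[a11, a12], [a21, a22]] @ [u, v] = [b1, b2] by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular system")
    u = (b1 * a22 - a12 * b2) / det
    v = (a11 * b2 - a21 * b1) / det
    return u, v


# Equations 3.11 and 3.12 with the coefficients printed in the text.
l1, l2 = solve_2x2(1.0, 1.0, 4.50, 0.087, 0.996, 1.999)

if __name__ == "__main__":
    print(f"x - b = {x - b:.3f} in")          # right-hand side of Equation 3.12
    print(f"l1 = {l1:.2f} in, l2 = {l2:.2f} in")
```

Evaluated this way, x - b comes out near 1.99, which agrees with the right-hand side of Equation 3.12 to within rounding, and the two solved lengths sum to the 4.5 inch finger length.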


3.1.2  Motor  Selection 

There are four motors, one controlling each finger, and they are contained within the palm
(see Figure 3-9) to minimize the size. The motors and gearboxes were chosen by
calculating the required torque and speed. At no load, the desired maximum angular
velocity of the joints is 2 rps = 120 rpm, permitting the finger to open and close
fully in 0.5 seconds. Considering the finger’s own weight and the applied force, the
maximum load is conservatively overestimated as 1/2 pound centered one inch away from the
motor. With this assumption, the stall torque is

$$\tau = 0.5\,\text{lbs} \times 1\,\text{in} = 16.0\,\text{oz-in.} = 0.12\,\text{Nm}. \tag{3.13}$$

Therefore the required power of the motor assuming 60 percent efficiency is

$$P = \frac{\tau\omega}{0.60} = \frac{0.12\,(4\pi)}{0.60} = 2.5\,\text{Watts}. \tag{3.14}$$
Figure 3-9: Inside the palm (labels: steel cable, pulley).


Maximum intermittent power output (Watts)     2.7
Maximum continuous power output (Watts)       2.0
Maximum efficiency (%)                        76
No load speed (RPM)                           11,300
Stall torque (oz-in.)                         1.25
Maximum continuous torque (oz-in.)            0.35
Weight (oz)                                   0.71

Table 3.1: The characteristics of MicroMo’s DC MicroMotor 1331.

To meet these criteria and to minimize weight and size, MicroMo’s DC motor series
1331 with a 15/5 76:1 gearbox was chosen. The characteristics of the motor and the
gearbox are shown in Table 3.1 and Table 3.2.
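
The selection can be cross-checked numerically. The sketch below takes the 0.12 Nm torque of Equation 3.13 as given, together with the 2 rps speed, the 60 percent efficiency, and the catalog values of Tables 3.1 and 3.2; the function and constant names are hypothetical. Note that a direct unit conversion of 0.5 lb at 1 in gives 8 oz-in, so the 16.0 oz-in figure may already include a margin; that reading is an assumption.

```python
import math

OZ_IN_TO_NM = 0.0070615518  # one ounce-force inch expressed in newton metres


def required_power_w(torque_nm, speed_rps, efficiency):
    """Electrical power needed to deliver torque_nm at speed_rps."""
    omega = 2.0 * math.pi * speed_rps  # rad/s
    return torque_nm * omega / efficiency


# Requirement from the text: 0.12 Nm at 2 rps with 60 percent efficiency.
power = required_power_w(0.12, 2.0, 0.60)

# Catalog values from Tables 3.1 and 3.2.
GEAR_RATIO = 76
NO_LOAD_RPM = 11300
MAX_INTERMITTENT_OUT_OZ_IN = 42.4

output_no_load_rpm = NO_LOAD_RPM / GEAR_RATIO
load_torque_oz_in = 0.12 / OZ_IN_TO_NM

if __name__ == "__main__":
    print(f"required power: {power:.2f} W")
    print(f"geared no-load speed: {output_no_load_rpm:.0f} rpm (target 120 rpm)")
    print(f"load torque: {load_torque_oz_in:.1f} oz-in "
          f"(intermittent limit {MAX_INTERMITTENT_OUT_OZ_IN} oz-in)")
```

Under these assumptions the geared no-load speed stays above the 120 rpm target and the load torque stays below the gearhead's intermittent limit.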


3.1.3  Grasping  Capability 

Reduction ratio                                  76 : 1
Maximum continuous output torque (oz-in.)        14.2
Maximum intermittent output torque (oz-in.)      42.4
Efficiency (%)                                   68
Weight (oz)                                      0.61

Table 3.2: The characteristics of MicroMo’s gearhead 15/5.

The size of objects that can be manipulated is largely determined by the length of the
phalanges. By taking advantage of the four-fingered hand, large or non-trivially shaped
objects may be grasped. For example, the ring finger can be used as a base to hold a large
object that the other fingers cannot wrap all the way around. The manipulability also
depends on the material of the object grasped. The surface of the hand is covered
with a thin layer of vinyl to increase friction. When the static friction between the
skin and the object overcomes the gravitational force, the object does not slip off.
To analyze the friction, one point of contact with an object is considered. The static
friction between the object and the skin is given by

$$f \le \mu_s N \tag{3.15}$$

where $f$ is the frictional force, $\mu_s$ is the coefficient of static friction, and $N$ is the
magnitude of the normal force. Figure 3-10 shows the object at the moment that
sliding is about to take place. The forces that act on the object are the normal force,
$N$, that is the grasping force applied by fingers pushing into the object, the weight
of the object, $W$, and the frictional force, $f$. Because the object is in equilibrium, the
resultant external force acting on it must be zero,

$$\sum \mathbf{F} = \mathbf{f} + \mathbf{W} + \mathbf{N} = 0. \tag{3.16}$$

The x component of this vector equation gives

$$\sum F_x = f - W = 0. \tag{3.17}$$

At equilibrium, the static frictional force has its maximum value. Using Equation 3.15
and Equation 3.16, we get

$$f = \mu_s N = W. \tag{3.18}$$
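
As a worked illustration of the slip condition, the sketch below computes the smallest normal force that holds an object of a given weight at a single contact point. The function names and the numbers in the example are hypothetical; only the relation between friction, normal force and weight is taken from the text.

```python
def min_normal_force(weight, mu_s):
    """Smallest grasping force N for which mu_s * N balances the weight."""
    if mu_s <= 0:
        raise ValueError("coefficient of static friction must be positive")
    return weight / mu_s


def holds(normal_force, weight, mu_s):
    """True if static friction can support the weight without slipping."""
    return mu_s * normal_force >= weight


if __name__ == "__main__":
    # Hypothetical object: weight 2.0 (any force unit), mu_s = 0.5.
    print(min_normal_force(2.0, 0.5))  # 4.0
```

A lower friction coefficient raises the required grasping force in inverse proportion, which is the motivation for covering the hand in a high-friction skin.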
Figure 3-10: A free-body diagram of object on skin.


When an object is grasped, the finger is positioned using PID control such that a
firm grasp is achieved by having a constant $N$. $\mu_s$ is a combination of $\mu_{sl}$ of latex
and $\mu_{so}$ of the object, and $W$ is object dependent; therefore, a relationship,

$$\frac{\mu_{sl} + \mu_{so}}{W} = \text{constant} \tag{3.19}$$

can be achieved. When a learning tool such as the one described in Chapter 4 is available,
various grasping positions can be considered to improve the skill, as infants do during
their exploratory manipulation stage.
3.2  Computation  Tools

3.2.1  Spinal  Cord  Level  Computation 

A motor board with a 6811 and a sensor interface board are mounted on the dorsum
as shown in Figure 3-11. They function like a spinal cord by controlling finger
movements such as reflexes. The Motorola MC68HC11K4 includes a CPU, 24 Kbytes of
EPROM, 640 bytes of EEPROM, 768 bytes of RAM, four 8-bit pulse-width
modulators, 8-channel 8-bit analog-to-digital converters, and other MC6811 features. The
K4 has been chosen specifically to take advantage of its onboard PWM generators with
frequency and duty-cycle variation, which allow the whole hand to be controlled by
only one MC6811 chip and eliminate a complex sequence of latches and flip-flops.
The PWM frequency can be specified using two bytes, between 0.05 Hz and 40 kHz with
an 8 MHz external crystal clock. The overall picture of the motor board design is
shown in Figure 3-12. The board is designed so that optoisolators
isolate the motor signals from the analog/digital signals. An L293E motor driver takes a
duty-cycled PWM signal and a direction, and sends a processed signal to a motor.
The chip is also capable of sensing the load current, which becomes part of the sensory
information. The potentiometer outputs and the rest of the sensory information are
multiplexed and fed to the analog-to-digital converter ports. The serial line is used to
communicate with a CPU and to download programs to EPROM and EEPROM, and
the 68332 interface module decodes and connects to an SBC332 board that handles
the brain level computation.

Figure 3-11: A picture of the dorsum with a motor board and sensor interface board
mounted.

Figure 3-12: An overall picture of the motor board design using Motorola MC6811.
Figure 3-13: A backplane interfacing 16 processors.

3.2.2  Brain  Level  Computation 

The computation at this level is done by a massively parallel system consisting of a
parallel processing system and an interface between the processors and a Macintosh
computer acting as a front end processor (FEP). The design allows the
whole system to be expanded to 16 backplanes, each backplane consisting of
16 processing elements, as shown in Figure 3-13 [15]. A commercially available Vesta
SBC332 board is used as the basic processing element, each dedicated to controlling a
specific subsystem of the whole robot. Each board contains a Motorola MC68332
microcontroller and onboard RAM and EPROM of up to 1 Mbyte each. These
independently controlled processors communicate through dual port RAMs (DPRAMs),
which allow two processors to share the memory space within them, permitting
information exchange with other processors to complete tasks such as hand-eye coordination.
From this point of view, the MC6811 motor board acts as a slave of this system. The
FEP is interfaced through a Motorola MC68332 that acts as an intermediate
front end processor (InterFEP). The FEP and InterFEP are interfaced with a SCSI bus, and
the InterFEP and the backplanes are interfaced through a serial port. The programming
environment is based on the Macintosh and in particular runs in Macintosh Common
Lisp. L, developed by Brooks [4], is a downwardly compatible subset of Common Lisp,
and it is run on each MIMD machine node. L is used to program the high level
learning routines that are introduced in the next chapter.


3.3  Sensors 

3.3.1  Exteroceptors 

Manipulation learning does not occur without fully utilizing exteroceptor and
proprioceptor sensory feedback. As exteroceptors, force sensing resistor (FSR) devices, which
resemble membrane switches, are used. The sensors are films less than 0.15 mm thick
that are wrapped around the surface of the fingers and the palm. The construction
of the sensor is based on two polymer sheets as shown in Figure 3-14. A
conducting pattern is deposited on one polymer sheet in the form of a set of interdigitating
electrodes, and a proprietary semiconductive polymer is deposited on the other sheet.
The sheets are faced and laminated together with a combination adhesive spacer
material. With no applied force, the resistance between the electrodes is high, and the
resistance drops as the force increases, following a power law relationship. Two 2 inch
x 2 inch square FSRs are wrapped around each phalange. For the palm, four
position sensing resistors (PSRs) and a large FSR which covers the entire palm are used.
A linear potentiometer, a kind of PSR shown in Figure 3-15, measures the position
of an applied force along its sensing strip.

Figure 3-14: Commercial Force Sensing Resistor structure (labels: semi-conducting polymer, interdigitating electrodes).

Figure 3-15: Commercial Position Sensing Resistor structure.

Figure 3-16: PSR equivalent circuit.

A voltage, generally 5 volts, is applied
between the Hot and Ground ends of the fixed resistor strip. When force is applied
to the force sensing layer, the wiper contacts are shunted through that layer to one of
the conducting fingers of the resistor strip. The voltage read from the wiper is thus
proportional to the distance along the strip at which the force is applied. An equivalent
circuit for this arrangement is shown in Figure 3-16. Position sensing resolution can
be approximated by

$$\Delta x = \frac{2 w_g^2}{w_f} \tag{3.20}$$

where $w_g$ is the width of the conductive fingers, normally 0.5 mm, and $w_f$ is the width
of the applied force, with the assumption of a constant force across the force footprint.
One drawback of this material is that the force measurement is of one point only.
If multiple locations are stimulated, the barycentric position, a positional average
weighted over the force distribution,

$$x_{ave} = \frac{\int_{ground}^{hot} x\,F(x)\,dx}{\int_{ground}^{hot} F(x)\,dx}, \tag{3.21}$$

where $x$ is the position of contact and $F(x)$ is the force distribution, is measured.
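
For a finite set of contact points, the barycentric position reduces to a force-weighted average. The sketch below is a discrete stand-in for the integral form, with hypothetical function and argument names; positions are measured along the strip from the Ground end.

```python
def barycentric_position(positions, forces):
    """Force-weighted average contact position along the sensing strip."""
    total = sum(forces)
    if total <= 0:
        raise ValueError("no force applied")
    return sum(x * f for x, f in zip(positions, forces)) / total


if __name__ == "__main__":
    # Two contacts at 10 mm and 30 mm; the first is pressed three times harder.
    print(barycentric_position([10.0, 30.0], [3.0, 1.0]))  # 15.0
```

Two equal contacts therefore read as a single touch halfway between them, which is why the PSR cannot distinguish multiple simultaneous contacts.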

Since these measurements are processed at a sensor interface board on the dorsum,
wires must go through the inside of the phalanges and the constantly moving joints.
To accommodate this, the sensor is modified by eliminating an interface strip
and attaching commercially available durable and flexible wires to the surface of the
film using a conductive adhesive epoxy. The epoxy is chosen so that it solidifies at
room temperature, which avoids melting the film of the sensor, and so that its hardness
is low once solidified.

Figure 3-17: A block diagram of a sensor interface board.

All the wires are connected to a sensor interface board where
the sensed resistance values, $R_{FSR}$s and $R_{PSR}$s, are processed. The block diagram of
the interface board is shown in Figure 3-17. The FSR signals are interfaced using a
simple force to voltage conversion as shown in Figure 3-18. The output is described
by the equation


$$V_{out} = \frac{V_+}{1 + R_{FSR}/R_M} \tag{3.22}$$

where $V_{out}$ is the output voltage, $V_+$ is the supply voltage, and $R_M$ is the measuring
resistor value. According to the equation, the output voltage increases
with increasing force. $R_M$ is chosen to maximize the desired force sensitivity range, and
4.7 kΩ is used for the hand. For the PSRs, an output can be read through a simple
voltage  follower  as  a  buffer  as  shown  in  Figure  3-19.  In  order  to  prevent  the  high 
current  from  flowing  through  the  sensor  during  the  measurement,  it  is  important  to 


3.3.2  Proprioceptors 

Proprioceptors respond to changes in the position of the body or its parts.
Fundamentally, the use of motors is not anthropomorphic, since human joints are not
controlled by rotational forces but by forces applied by muscles. Muscles receive an
abundant supply of nerve endings acting as proprioceptors, although their functionality
is still not very clear. Since it is still impossible to build muscles that work the way
ours do, the use of actuators is unavoidable. With an implementation using
motors, it is possible to measure the rotational position of the motors. To minimize
size and weight, rotational potentiometers are used instead of optical encoders, as
shown in Figure 3-20. The information gathered is filtered through an RC circuit
and converted to an 8-bit digital signal at an MC6811 analog-to-digital converter port.
This gives a resolution of $2^8$ per rotation of a motor, which amounts to a 180° resolution
between the curled and expanded configurations of a finger, which is much more precise
than what humans are capable of measuring without visual feedback.

Another proprioceptor used is a current sensor that is a built-in capability of the
L293E motor driver. A load current, which can be as high as two volts, is converted
to voltage information with a resistor, avoiding a high current flow to the
microcontroller. This information both protects the motor from overheating and permits a
measurement of how hard a finger is working at each instant of a grip.
Chapter  4


Learning  Process 


Learning,  storing  past  experience  in  the  brain  to  guide  future  action,  is  an  effective 
way  of  refining  hand  movements.  In  the  early  20th  century,  Ivan  Pavlov  argued  that 
conditioned  reflexes  form  a  basis  for  all  learned  behavior.  In  the  1930’s  Burrhus  F. 
Skinner  argued  that  only  outcomes  such  as  rewards  and  punishments  caused  learning, 
though  many  psychologists  argued  against  it.  As  of  today,  the  nature  of  learning  is 
still  not  clear. 

This chapter presents two nervous systems that have been implemented for this
thesis. One is a system of the kind normally controlled at the spinal cord level, such as
a reflex, and the other is a higher level learning system that utilizes neural network
theory. The overall nervous control system is shown in Figure 4-1.


4.1  Low  Level  Controller 

The  low  level  calculations  are  all  done  in  the  MC6811  mounted  on  the  palm  and 
programmed  using  Assembly  language. 
Figure 4-1: A block diagram of overall nervous control system implemented.

Figure 4-2: A simple block diagram of feedback control system.

4.1.1  PID  Control 

The  control  structure  for  the  finger  movement  needs  to  be  a  closed  loop  to  compensate 
for  noise  from  the  environment  and  to  let  the  system  converge  at  all  times.  The 
dynamics  of  a  DC  motor  in  a  control  loop  shown  in  Figure  4-2  can  be  expressed  as 


$$\tau\ddot{\Phi} + K_0\dot{\Phi} + \Phi = K_1\,(C_{in} + K_2 T_l) \tag{4.1}$$

where $\tau$ is the time constant, $\Phi$ is the motor rotational position, $C_{in}$ is the input
from a controller, $T_l$ is the load torque, and the $K$'s are constants related to the
motor characteristics. $\tau$ is determined by the characteristics of the motor, and as
it becomes smaller, the closed loop system becomes faster and more desirable. From
Equation 4.1, using the Laplace transform, the motor process can be written as

$$\text{Motor} = \frac{K_1}{\tau s^2 + K_0 s + 1}. \tag{4.2}$$


For this system, a proportional plus integral plus derivative (PID) controller is chosen
because  of  its  ability  to  provide  an  acceptable  degree  of  error  reduction  while  simulta¬ 
neously  providing  sufficient  stability  and  damping)!)].  For  this  system,  the  controller 
can  be  written  as 

$$\text{Controller} = G\left(1 + \frac{1}{T_I s} + T_D s\right)E \tag{4.3}$$

where $G$ is the feedback gain, $T_I$ is a constant called integral time, $T_D$ is a constant
called derivative time, and $E$ is the error. For the sensor, a potentiometer reading
$\Phi$ is used, so the gain for the sensor is 1. The system described above is shown in
Figure 4-3, and the output is calculated to be

$$\frac{K_1\left[E G T_D T_I s^2 + (E G T_I + K_2 T_l T_I)\,s + E G\right]}{T_I s\,(\tau s^2 + K_0 s + 1)} \tag{4.4}$$

where $E$ is $\Delta\Phi$, containing no $s$ term. Therefore, this system always converges
with time.

Figure 4-3: A block diagram of finger position control system.
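
The closed loop can be illustrated with a minimal discrete-time simulation. This is a sketch in Python for readability, not the 6811 assembly used on the hand: it integrates the second-order motor model with semi-implicit Euler steps, drives it with the PID law in the gain, integral-time and derivative-time form, and neglects the load torque. All numeric gains and motor constants are hypothetical values chosen only so that the loop is stable.

```python
def simulate_pid(setpoint=1.0, G=2.0, T_I=0.5, T_D=0.05,
                 tau=0.05, K0=0.5, K1=1.0, dt=0.001, t_end=10.0):
    """Simulate tau*phi'' + K0*phi' + phi = K1*c_in under PID control.

    c_in = G * (e + (1/T_I) * integral(e) + T_D * de/dt), load torque = 0.
    Returns the motor position at t_end.
    """
    phi = 0.0        # motor rotational position
    phi_dot = 0.0    # motor rotational velocity
    integral = 0.0   # running integral of the error
    prev_error = setpoint - phi  # avoids a derivative kick on the first step
    for _ in range(int(round(t_end / dt))):
        error = setpoint - phi
        integral += error * dt
        derivative = (error - prev_error) / dt
        prev_error = error
        c_in = G * (error + integral / T_I + T_D * derivative)
        phi_ddot = (K1 * c_in - K0 * phi_dot - phi) / tau
        phi_dot += phi_ddot * dt
        phi += phi_dot * dt
    return phi


if __name__ == "__main__":
    print(f"final position: {simulate_pid():.4f}")
```

With these assumed values the position settles on the setpoint; the integral term is what removes the steady-state error that a purely proportional loop would leave for this plant.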

4.1.2  Reflex 

A reflex is a system that is controlled at the spinal cord. A curling reflex exists only
in infants and causes the fingers to curl when the inner surface of the palm is touched. A
releasing reflex occurs when an intolerable amount of stimulus is applied to
the skin. A releasing reflex is useful both for avoiding physical damage to the hand
and for learning the limits of its capability. A curling reflex is important at the learning
stage, but it can be eliminated eventually. Based on these ideas, a very simple reflex
system is implemented using force sensing resistor (FSR) sensory feedback. When an
FSR senses a signal higher than its threshold, the joints are commanded
so that the finger moves in the opposite direction from where the stimulus is
applied. The normal commands sent to the motors are overridden by the reflex signals.
If the inner skin is weakly stimulated, all the fingers are commanded to curl until the
sensor reading reaches 30. This pressure is not strong enough to hold an object, but
it simulates babies’ reflex systems. This curling reflex initiates the learning process
described in the next section.
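
The override logic can be sketched as a small arbitration function. Only the curl target of 30 comes from the text; the release and touch thresholds, the command strings, and the function name are hypothetical placeholders, and the sketch is in Python rather than the assembly used on the hand.

```python
CURL_TARGET = 30          # sensor reading at which curling stops (from the text)
RELEASE_THRESHOLD = 200   # hypothetical "intolerable" reading, not from the thesis
TOUCH_THRESHOLD = 5       # hypothetical minimum reading that counts as a touch


def reflex_override(inner_fsr, normal_command):
    """Return the command actually sent to the motors.

    Reflex signals take priority over the normal command: a very strong
    stimulus triggers release, a weak touch on the inner skin triggers curl.
    """
    if inner_fsr >= RELEASE_THRESHOLD:
        return "release"
    if TOUCH_THRESHOLD <= inner_fsr < CURL_TARGET:
        return "curl"
    return normal_command


if __name__ == "__main__":
    for reading in (0, 10, 30, 250):
        print(reading, reflex_override(reading, "hold"))
```

Placing the release check first reflects the protective role of that reflex: it wins over both the curling reflex and the normal command.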


4.2  High  Level  Neural  Networks 

Due to the lack of visual and auditory feedback, only the primitive learning processes
that occur locally for the hand are considered in this thesis. For infants, different
learning processes occur interactively and simultaneously. For example, consider a
situation where an infant tries to lift an object off the ground: grasping, lifting the
hand, and failing to lift up the object. From visual feedback, the infant recognizes
that the object has slipped off the hand. By repeating this process, infants learn to
connect the visual “slip” with their sensory information. Adults can hold an object
without slipping by applying enough but not excessive force.
This operation is possible due to repeated practice at the
initial grasping learning stage. Simultaneously, when the infant touches and drops the
object, joint proprioceptors and exteroceptors on the skin react in a certain way. After
some repetitions, infants connect the sensory information with the
objects’ hardness, texture and weight. All those separate learning processes merge to
create our consistent, stable manipulation skills. For this thesis I implemented three
learning processes separately, each utilizing neural networks with a different strategy.
First, object hardness recognition learning is conducted using a competitive learning
strategy. Second, a three layered backpropagation algorithm is used to train the shear
detection. By applying those two trained networks, the optimal grasping action is
searched for by a reinforcement learning strategy (somewhat similar to the Q-learning
technique).

4.2.1  Hardness  Recognition  Network 

Theory  of  Competitive  Learning 

Topologically, there is substantial evidence for the spatial self-organization of brain
areas that contain sensory or motor maps [8]. For some stimuli, there is some form of
competition between activities of neurons on the neural surface. The idea of
competitive learning was originally proposed by Rosenblatt [29], and implemented successfully
by many [17]. Competitive learning contains lateral feedback, which depends on
the lateral distance from the point of its application. From biological inspiration,
lateral feedback is described by a Mexican hat function, shown in Figure 4-4. Short-range
lateral feedback has an excitatory effect, and penumbra lateral feedback has
an inhibitory effect. The output signal of neuron $i$, $y_i$, at time step $n + 1$ can be
expressed by the following difference equation:

$$y_i(n+1) = \psi\left(\sum_{j=1}^{p} w_{ij} x_j + \beta \sum_{k=-K}^{K} c_{ik}\, y_{i+k}(n)\right), \quad \text{for } i = 1, 2, \dots, N \tag{4.5}$$

where $\psi(\cdot)$ is some nonlinear function to ensure $y_i \ge 0$, $w_{ij}$ is the synaptic weight of the
$j$th feedforward connection, $p$ is the number of input terminals, $x_j$ is the $j$th input
signal, $\beta$ is the feedback factor that controls the rate of convergence of the relaxation
process, $K$ is the radius of the lateral interaction, $c_{ik}$ is the lateral feedback weight
connected to neuron $i$, and $N$ is the number of neurons in the network.

Figure 4-4: The Mexican hat function of competitive learning lateral connections.
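
The relaxation described by this difference equation can be sketched for a one-dimensional row of neurons. The sketch makes several assumptions that are not from the thesis: the feedforward sums are passed in precomputed, the lateral weights are the same for every neuron, neighbours beyond the ends of the row contribute nothing, and the nonlinearity is a clamp to the interval from 0 to a cap. The weights and inputs are hypothetical.

```python
def relax(feedforward, lateral, beta=0.5, cap=10.0, steps=200):
    """Iterate y_i(n+1) = psi(I_i + beta * sum_k c_k * y_{i+k}(n)).

    feedforward: precomputed sums I_i = sum_j w_ij * x_j, one per neuron.
    lateral:     Mexican-hat weights c_k for k = -K..K (odd length).
    psi clamps each output to [0, cap].
    """
    n = len(feedforward)
    radius = len(lateral) // 2
    y = [0.0] * n
    for _ in range(steps):
        new_y = []
        for i in range(n):
            feedback = 0.0
            for k in range(-radius, radius + 1):
                j = i + k
                if 0 <= j < n:  # neighbours outside the row are ignored
                    feedback += lateral[k + radius] * y[j]
            new_y.append(min(cap, max(0.0, feedforward[i] + beta * feedback)))
        y = new_y
    return y


# Excitatory centre and near neighbours, inhibitory penumbra.
mexican_hat = [-0.3, 0.4, 1.0, 0.4, -0.3]
inputs = [0.1, 0.2, 0.5, 1.0, 0.5, 0.2, 0.1]
y = relax(inputs, mexican_hat)

if __name__ == "__main__":
    print([round(v, 3) for v in y])
```

In this example the lateral feedback amplifies the response around the strongest input and suppresses the edges, which is the contrast-sharpening effect competitive learning builds on.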

Application 

Utilizing competitive learning theory, the hardness of objects can be categorized over
time. The experiment is conducted with eight different objects of the same size and
different compressibilities. Each object is touched by curling one finger around the
object very slowly: precisely, the finger takes three seconds to fold fully, holds for two
seconds, and takes three seconds to straighten. The sensory readings are
taken from both the force sensors on the finger and the potentiometer of the motor
controlling the finger, and are converted to eight-bit digital information. The
program is written for the 6811 so that the readings are recorded every 0.14 seconds.
The raw data extracted from a finger is shown in Figure 4-5. The potentiometer
reading, p(t), indicates the position of the finger. The derivative of p(t) has three
distinct characteristics:

$$\frac{dp(t)}{dt} = \begin{cases} c_1, & c_1 = \text{constant} \neq 0 \\ -c_2\,(1 - \dots) + c_3 \\ 0 \end{cases} \tag{4.6}$$

As the finger curls, the motor moves at a constant rate while the finger is not in
contact with the object surface. At this stage, dp(t)/dt is a nonzero constant. When
the object is firmly grasped, the finger stops curling, which results in dp(t)/dt → 0.
The significant difference between objects of different hardness can be seen in the stage
where dp(t)/dt is not constant. This stage signifies that the object and the finger
are in contact, but the object’s compliance is letting the finger continue to move.
Objects have a constant compliance factor, $C_a$, which is proportional to the hardness
of the object. A comparison of two objects with different hardness is shown in
Figure 4-6. It seems as if the hardness of objects can be categorized using only this
information. However, repeated experiments with the finger show some unexpected
results, which may not be relevant to humans because of our superior tactile sensory
system. Due to their nature, the force sensors are not capable of sensing a force
smaller than 20 grams. When a very spongy object is grasped, the sensor cannot
detect the contact until the object is squashed enough to give some force back to the
finger. Therefore, a very spongy object gives a response similar to that of a hard object.
One sensory difference between those two objects is the force reading when the object is
completely compressed and held. Since the spongy object has the resistance force
orthogonal to the finger surface, the force reading is much higher than for the harder
object. This analysis shows why both potentiometer and force sensor information
are crucial in distinguishing the object hardness.

Figure 4-5: Hardness recognition raw data extracted from a medium hardness object
during the finger folding stage (panels: raw data from force sensor, raw data from potentiometer).

Figure 4-6: Comparison for soft and hard objects (panels: dp(t)/dt for a soft object, dp(t)/dt for a hard object).


Experimental  Results 

For each curling experiment, two numbers are extracted and recorded. The first is
the duration for which dp(t)/dt is not constant, Δt, expressed in digital units where
one unit is 0.14 seconds. The other is the maximum force sensor reading, expressed
as a seven-bit digital number: the eight-bit reading is shifted one bit to the right to
eliminate small noise. Eight different objects are tested ten times each and the results
are plotted in Figure 4-7. Using those data as inputs, a 2 layer, 6 neuron competitive
network is constructed with random initial synaptic weights and trained. Figure 4-8
shows the neurons over the input map as they are trained. Since this is
unsupervised learning, the initial randomness can lead the neurons to categorize
somewhat differently from what was intended when the training session is too short or
the learning rate is too high (a confused neuron is shown in Figure 4-9). Even with
bad initial random weights, such as the ones causing the confused neuron, the result
converges after 500 epochs. Once the network is trained, different inputs can be fed
to the network to find the category of the touched object. This strategy works well
for this purpose since there is no clear-cut way to categorize objects. The trained
network was tested with data taken from objects not used for training, as shown in
Figure 4-10. Even with very diverse test objects, the sensory readings fell close to the
trained neurons. Initially training the network with six diverse hardness categories
gives a good distribution of graspable objects. Even if an object with dramatically
different compliance is found, it only takes roughly 10 experiments to collect data and
500 epochs to retrain, all of which takes less than one minute.

Figure 4-7: Competitive learning input data.
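The hardness recognition procedure described above can be illustrated with a
minimal winner-take-all sketch. It is not the MATLAB code used for the thesis: the
function names, the learning rate of 0.05, the initialization inside the bounding box
of the data and the sample values are all assumptions made for illustration. Only the
two input features, the one-bit right shift, the six neurons, the random initial weights
and the 500 epochs come from the text.

```python
import random

def seven_bit(reading_8bit):
    """Drop the least significant bit of an 8-bit force reading (0..255 -> 0..127)."""
    return reading_8bit >> 1

def nearest(neurons, x):
    """Index of the neuron whose weight vector is closest to input x."""
    return min(range(len(neurons)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(neurons[i], x)))

def train_competitive(samples, n_neurons=6, rate=0.05, epochs=500, seed=0):
    """Winner-take-all training: only the closest neuron moves toward each input."""
    rng = random.Random(seed)
    dims = len(samples[0])
    lo = [min(s[d] for s in samples) for d in range(dims)]
    hi = [max(s[d] for s in samples) for d in range(dims)]
    # Random initial synaptic weights, drawn inside the bounding box of the data
    # (an assumption; the text only says the initial weights are random).
    neurons = [[rng.uniform(lo[d], hi[d]) for d in range(dims)]
               for _ in range(n_neurons)]
    for _ in range(epochs):
        for x in rng.sample(samples, len(samples)):  # one shuffled pass per epoch
            winner = neurons[nearest(neurons, x)]
            for d in range(dims):
                winner[d] += rate * (x[d] - winner[d])
    return neurons

# Made-up (delta_t, max_force) pairs, for illustration only.
samples = [(17.0, 70.0), (18.0, 71.0), (16.0, 69.0),
           (20.0, 60.0), (21.0, 59.0), (19.0, 61.0)]
neurons = train_competitive(samples)
category = nearest(neurons, (20.0, 60.0))  # hardness category of a new reading
```

Because only the winning neuron moves, a neuron that starts far from every input may
never win. This is one way a "confused" or unused neuron of the kind mentioned in the
text can arise in such a sketch.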

4.2.2  Shear  Detection  Network 

Theory  of  Back-Propagation  Algorithm 

The back-propagation algorithm is the most popular algorithm for training multilayer
perceptrons in supervised learning. The process consists of a forward pass and a backward
pass with known desired output signals d(n), where n indexes the training
iteration. The input is applied to the network in the forward pass and fed through layer
by layer. The net internal activity level $v_i^{(l)}(n)$ for neuron i in layer l, which
receives p inputs from layer l-1, is

$$v_i^{(l)}(n) = \sum_{j=0}^{p} w_{ij}^{(l)}(n)\, y_j^{(l-1)}(n) \qquad (4.7)$$


Figure 4-8: Hardness recognition competitive learning training steps (recovered panel
titles: Competitive Learning, 1 cycle and 100 cycles; axes: delta time in 0.14 sec/unit
versus digital sensor reading): '+' are inputs and 'o' are the neurons.

Figure 4-9: Competitive learning at 40 cycles containing a confused neuron (the two
neurons around (17, 70) should be spread apart to the other input cluster around
(20, 60)): '+' are inputs and 'o' are the neurons.

Figure 4-10: Hardness recognition: competitive learning trained network with testing
inputs (all the test inputs are clustered around the existing trained neurons).

where $w_{ij}^{(l)}(n)$ is the synaptic weight of neuron i in layer l that is fed from neuron
j in layer l-1 at iteration n, and $y_j^{(l-1)}(n)$ is the function signal of neuron j in
layer l-1. At the output of each neuron in all the layers there is a nonlinear smoothing
function, a sigmoid,

$$y_i^{(l)}(n) = \frac{1}{1 + \exp(-v_i^{(l)}(n))} \qquad (4.8)$$

to make the function differentiable. At the output layer, L, the set of outputs is
compared to the desired value, giving an error signal,

$$e_i(n) = d_i(n) - y_i^{(L)}(n) \qquad (4.9)$$

which is propagated backward layer by layer against the direction of the synaptic
connections, adjusting the synaptic weights in the following manner:

$$w_{ij}^{(l)}(n+1) = w_{ij}^{(l)}(n) + \alpha\,[w_{ij}^{(l)}(n) - w_{ij}^{(l)}(n-1)] + \eta\,\delta_i^{(l)}(n)\, y_j^{(l-1)}(n) \qquad (4.10)$$

where η is the learning rate, α is the momentum constant, and the local gradient, δ,
is

$$\delta_i^{(L)}(n) = e_i(n)\, y_i^{(L)}(n)\,[1 - y_i^{(L)}(n)] \qquad (4.11)$$

for neuron i in the output layer L, and

$$\delta_i^{(l)}(n) = y_i^{(l)}(n)\,[1 - y_i^{(l)}(n)] \sum_k \delta_k^{(l+1)}(n)\, w_{ki}^{(l+1)}(n) \qquad (4.12)$$

for neuron i in a hidden layer l. The algorithm iterates these computations until the
network stabilizes within the bounds of the targeted error.
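The forward pass, error signal, local gradients and momentum update described above
can be sketched in a few lines. This is an illustrative sketch rather than the thesis
implementation: the class name `MLP`, the weight layout, the default values of `eta` and
`alpha` and the uniform initialization range `init_scale` are assumptions. The update
itself follows the standard back-propagation rule with momentum.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

class MLP:
    def __init__(self, sizes, eta=0.5, alpha=0.9, seed=0, init_scale=0.5):
        rng = random.Random(seed)
        self.eta, self.alpha = eta, alpha
        # w[l][i][j]: weight into neuron i of layer l+1 from neuron j of layer l.
        # Index j = 0 is the bias input, which is fixed at 1.
        self.w = [[[rng.uniform(-init_scale, init_scale) for _ in range(n_in + 1)]
                   for _ in range(n_out)]
                  for n_in, n_out in zip(sizes[:-1], sizes[1:])]
        self.w_prev = [[row[:] for row in layer] for layer in self.w]

    def forward(self, x):
        ys = [[1.0] + list(x)]
        for layer in self.w:
            # Weighted sum followed by the sigmoid, layer by layer.
            out = [sigmoid(sum(w * y for w, y in zip(row, ys[-1]))) for row in layer]
            ys.append([1.0] + out)
        return ys

    def train_step(self, x, d):
        ys = self.forward(x)
        out = ys[-1][1:]
        errors = [dk - yk for dk, yk in zip(d, out)]               # error signal
        deltas = [[e * y * (1.0 - y) for e, y in zip(errors, out)]]  # output layer
        for l in range(len(self.w) - 1, 0, -1):                    # hidden layers
            nxt = deltas[0]
            deltas.insert(0, [
                y * (1.0 - y) * sum(nxt[k] * self.w[l][k][i + 1]
                                    for k in range(len(nxt)))
                for i, y in enumerate(ys[l][1:])])
        for l, layer in enumerate(self.w):                         # weight update
            for i, row in enumerate(layer):
                for j in range(len(row)):
                    step = (self.alpha * (row[j] - self.w_prev[l][i][j])
                            + self.eta * deltas[l][i] * ys[l][j])
                    self.w_prev[l][i][j] = row[j]
                    row[j] += step
        return 0.5 * sum(e * e for e in errors)

# Six inputs, two hidden layers of six neurons, one output (sizes are illustrative).
net = MLP([6, 6, 6, 1])
error = net.train_step([5, 0, 0, 5, 5, 0], [1.0])
```

All local gradients are computed from the old weights before any weight is changed,
which is what the backward pass requires.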


Application 

Visually, it is obvious when an object slips from a hand. Through repeated experience
of shear, infants develop the relationship between the sensory information on the
fingers and the visible result. Shear is locally detectable sensory information if there exist
multiple rows of pressure sensors perpendicular to the direction of slip. With the way
the robot hand for this research is oriented, the palm is perpendicular to the ground
and the fingers are horizontal, which makes the three fingers orthogonal to the direction
of slip. In order to simulate the shear learning process, sensory data from the fingers
are used as inputs and the visual feedback about the existence of shear is used as
the desired output to train a feedforward network. Since shear is a time dependent
process, the input signals have to contain sensor readings from multiple time steps. The
size of the input signal vector is defined as

$$(\text{row}, \text{col}) = (tf,\; m^{tf}) \qquad (4.13)$$

where t is the number of discrete time steps, f is the number of finger sensors used
and m is the number of sensory reading levels. This size needs to be minimized in
order to speed up the learning operation. Straight out of the microcontroller, there
are $m = 2^7$ sensory reading levels. As equation 4.13 makes clear, the number of columns
grows as $m^{tf}$, so merely feeding forward an input of this size would be impractically
slow. Also, for a noisy environment, this is
not an optimal implementation. As a solution, m is reduced to two numbers, 5 and 0,
as the maximum and the minimum inputs. A back-propagation classifier can generalize
to the numbers between the maximum and minimum well, given an optimal number of layers
and no overtraining. When the network is overtrained, it overfits the inputs and
cannot adapt to the values between 4 and 1. Reducing m to two still retains enough
information to conserve the physics of shear and makes the calculation much simpler
and faster. Since slipping is not a reversible operation without an external force
applied, recording two discrete time steps with an optimal step size is satisfactory. If
the step is too small, most of the calculation will be wasted detecting no changes in
the readings. However, if the step is too large, the shear will not be detected quickly
enough. To calculate the maximum speed of object slipping, assuming no friction,
the equation

$$x - x_0 = v_0 t + \frac{1}{2} a t^2 \qquad (4.14)$$

can be used. Since $v_0 = 0$, the time it takes for a point of an object to slip from one
finger to the other is 0.06 seconds. Therefore a step size of 0.28 seconds is chosen.
With those assumptions, the input vector has the size of (6, 64). When the columns
of the vector are examined, there are 10 columns of inputs that are not realistic or
are ambiguous, so the size can be reduced even more to (6, 54). The desired output
data is one bit of information, 1 being shear detected and 0 being no shear.
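The size argument above can be checked with a short sketch. The function names are
hypothetical, and the finger spacing is not given in the text: the distance below is
back-computed from the stated 0.06 seconds under the assumption that the slipping
object accelerates at g = 9.81 m/s^2, so it should be read as an inferred value only.

```python
import math
from itertools import product

def input_shape(t, f, m):
    """(rows, cols) of the input set: t*f readings per pattern, m**(t*f) patterns."""
    rows = t * f
    return rows, m ** rows

def all_input_columns(t, f, levels):
    """Enumerate every possible input pattern for the given reading levels."""
    return list(product(levels, repeat=t * f))

def slip_time(distance_m, accel=9.81):
    """Time to travel distance_m from rest with no friction (v0 = 0)."""
    return math.sqrt(2.0 * distance_m / accel)

def slip_distance(time_s, accel=9.81):
    """Distance travelled from rest in time_s with no friction (v0 = 0)."""
    return 0.5 * accel * time_s ** 2

full = input_shape(2, 3, 128)       # raw 7-bit readings: (6, 128**6)
reduced = input_shape(2, 3, 2)      # two levels only: (6, 64)
columns = all_input_columns(2, 3, (0, 5))
spacing = slip_distance(0.06)       # roughly 0.018 m, an inferred value
```

With two time steps and three finger sensors, reducing m from 128 to 2 shrinks the
number of columns from 128^6 to 64, which matches the (6, 64) size stated in the text.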


Experimental  Results 

Having six input nodes and one output neuron, a four layer feedforward network with
two hidden layers is constructed. Because of the simplification made for the sensory
inputs, rounding the data and reducing m to a smaller number of levels as follows,

$$\left\{ \begin{array}{rcl}
81 \sim 127 & \rightarrow & 5 \\
61 \sim 80 & \rightarrow & 4 \\
41 \sim 60 & \rightarrow & 3 \\
21 \sim 40 & \rightarrow & 2 \\
1 \sim 20 & \rightarrow & 1 \\
0 & \rightarrow & 0
\end{array} \right. \qquad (4.15)$$

a four layer network was found to be optimal for the generalization to occur well. The
inputs are taken from the sensors on the three fingers as the fingers curl around the
given object, a paper cup. Since the hardware is not ready to run the hand completely
autonomously, some external force was applied to reach the grasping posture. When
slip is not detected, the computer is given a default signal 0, which signifies the
non-slip stage. When it is detected, a 1 is manually typed in through the serial port as a
visual feedback signal, overwriting the default input.

network                           learning rate
2 neuron hidden layer network     1.58
6 neuron hidden layer network     1.01
15 neuron hidden layer network    0.65

Table 4.1: The optimal learning rate for each network.

Again, since the visual system


is not at the stage where it can cooperate with the hand, the experimenter has to input
the signal when a visually obvious slip is detected. After enough cases of slip were
introduced, all the sensory data was recorded and the network was trained separately
from the hardware. Eventually I would like to train the network on line, but without
real visual feedback the manual labor is overwhelming. To record one set of
input vectors to run one epoch, about 50 different slips are manually input, and
to train the networks, at least 500 epochs are required. In the training session, the
number of hidden layer neurons and the learning rate were varied to find the optimal
back-propagation networks. The number of neurons in the first hidden layer was fixed
at 6 to match the number of input nodes. The number of neurons in the second
hidden layer was varied among 2, 6 and 15. Setting the desired sum-squared
network error,

$$E(n) = \frac{1}{2} \sum_{i \in C} e_i^2(n) \qquad (4.16)$$

to 0.006, I have trained the networks with different learning rates. If the desired
error was not reached within 500 epochs, the training was stopped. The results are
graphed in Figure 4-11, Figure 4-12 and Figure 4-13. Since the initial
random weights give different initial sum-squared errors, comparing the speed of
convergence between different networks is not meaningful. As expected, all the networks
converge faster when the learning rate is increased. However, as soon as the learning
rate exceeds the fastest convergent point, the systems never converge and seem to get
stuck in a local minimum at E(n) = 19.00. The optimal learning rate for each network
is shown in Table 4.1. Even if the system converges in the end, the error does not
decrease steadily when the learning rate is higher. This makes the system unreliable
since, depending on the inputs, the system may find a local minimum
and never converge. Even though this problem may be solved when it is run on line
with some noticeable noise which can push the system away from local minima, picking
a good middle-ground learning rate does make the system more reliable. As far as the
number of hidden neurons is concerned, the calculation time increases significantly
as more neurons are added. Even though a network containing larger hidden layers
can stably take a higher learning rate, if each epoch takes longer to calculate, the
advantage is diminished. For this specific experiment, six hidden neurons for both
hidden layers and a learning rate of 1.0 seem to be the optimal solution,
though this may change as the system is trained on line in the future. Average
outputs of a trained network taken under many operations containing slips are shown
in Table 4.2, where input difference is the most significant sensor reading difference
between two readings. The output is well categorized even for the inputs that are not
used for training, such as 1 to 4.

Figure 4-13: Fifteen neuron hidden layer training result (sum-squared network error
versus epoch; recovered panel titles: Sum-Squared Network Error for 500 Epochs and
Sum-Squared Network Error for 227 Epochs): top left, η = 0.1; top right, η = 0.5;
bottom left, η = 0.6; bottom right, η = 0.8.

Input Difference    Slipped?    Output
 5                  yes         0.9863
 4                  yes         0.9863
 3                  yes         0.9867
 2                  yes         0.9905
 1                  yes         0.9903
 0                  no          0.0007
-1                  no          0.0002
-2                  no          0.0003
-3                  no          0.0003
-4                  no          0.0003
-5                  no          0.0003

Table 4.2: Trained slip detection network output with testing inputs.
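The rounding of seven-bit sensor readings into six levels (Equation 4.15) and the
sum-squared error used as the stopping criterion can be sketched as follows. The
function names and the ordering of the six inputs (three finger readings at the first
time step, then three at the second) are assumptions made for illustration; the level
boundaries are taken from the text.

```python
import bisect

# Inclusive upper edges of levels 0..5, taken from Equation 4.15.
UPPER_EDGES = [0, 20, 40, 60, 80, 127]

def quantize(reading):
    """Map a 7-bit sensor reading (0..127) to one of the levels 0..5."""
    if not 0 <= reading <= 127:
        raise ValueError("expected a 7-bit reading in 0..127")
    return bisect.bisect_left(UPPER_EDGES, reading)

def input_vector(step_a, step_b):
    """Six quantized inputs: three finger readings at each of two time steps."""
    return [quantize(r) for r in list(step_a) + list(step_b)]

def sum_squared_error(desired, actual):
    """Half the sum of squared output errors for one training pattern."""
    return 0.5 * sum((d - y) ** 2 for d, y in zip(desired, actual))

x = input_vector((0, 30, 127), (90, 10, 50))
```

Training would stop once the error computed by `sum_squared_error` falls to the target
of 0.006, or after 500 epochs, as described in the text.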
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4.2.3  Grasping  Action  Network 


Theory  of  Reinforcement  Learning 


Reinforcement learning is based on the common sense idea that if an action is
followed by a satisfactory state of affairs, then the tendency to produce that action is
strengthened [33]. This idea was initially studied in psychology by Pavlov in learning
work with animals. In neural networks, the studies are focused on the actor-critic learning
algorithm and Q-learning, both based on the temporal difference method [33, 35, 37].
An actor-critic system has two subsystems: one is an evaluation network which
estimates the long term utility for each state, and the other is a policy network which learns
to choose the optimal action in each state. A Q-learning system maintains estimates
of the utilities of all state-action pairs and uses them to select a suitable action. The
objective of Q-learning is to estimate a real-valued function, Q, of states and actions,
where Q(x, a) is the expected discounted sum of future reward for performing action
a in state x and performing optimally thereafter. This relationship can be expressed
as:

$$Q(x_n, a_n) = E\{ r_n + \gamma \max_y Q(x_{n+1}, y) \} \qquad (4.17)$$

where $r_n$ is an immediate reward at step n, γ is a discount factor, 0 < γ < 1, and the
maximum is taken over the actions y available in the next state $x_{n+1}$. The estimate
of Q, $Q_{est}$, is updated at each time step,

$$Q_{est}(x_n, a_n) \leftarrow Q_{est}(x_n, a_n) + \beta_n \,[\, r_n + \gamma \max_y Q_{est}(x_{n+1}, y) - Q_{est}(x_n, a_n) \,] \qquad (4.18)$$

where $\beta_n$ is a gain sequence, and all the estimates are maintained within the function.
A gain sequence satisfies $0 < \beta_n < 1$, $\sum_{n=1}^{\infty} \beta_n = \infty$ and
$\sum_{n=1}^{\infty} \beta_n^2 < \infty$. Q-learning has been proven to converge [35].

Figure 4-14: Grasping Action Network block diagram
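The Q-learning update described in the theory above can be written as a small tabular
sketch. The names `q_update` and `gain`, the dictionary representation of the estimate
and the default discount factor of 0.9 are assumptions made for illustration; the gain
1/(n+1) is just one sequence that satisfies the stated conditions.

```python
from collections import defaultdict

def gain(n):
    """A gain with 0 < beta_n < 1, divergent sum and convergent sum of squares."""
    return 1.0 / (n + 1)

def q_update(q, x, a, r, x_next, actions, beta, gamma=0.9):
    """Move the estimate for (x, a) toward r + gamma * max over next-state actions."""
    best_next = max(q[(x_next, y)] for y in actions)
    q[(x, a)] += beta * (r + gamma * best_next - q[(x, a)])
    return q[(x, a)]

# Hypothetical states and actions, for illustration only.
q = defaultdict(float)              # every estimate starts at zero
first = q_update(q, "soft", "a1", 1.0, "soft", ["a1", "a2"], beta=gain(1))
```

The update needs the next state `x_next`; the Application discussion that follows
explains why this is not available for a one-way grasping motion.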


Application 

The Q-learning algorithm assumes that at each time step the system can observe the
input vector at the nth iteration, $x_n$; the action chosen by the Stochastic Action
Selector at the nth iteration, $a_n$; a reinforcement value; and the next input vector,
$x_{n+1}$. However, since grasping is a one-way operation (meaning open → close, not
open ↔ close), $x_{n+1}$ cannot be seen at the end of iteration n. Moreover, $x_n$ is
already analyzed and categorized using competitive learning networks. Following a
connectionist approach, an internal self-reinforcement system was built using two
components, as shown in Figure 4-14. The first is a Reinforced Probability Net, RPN,
which takes the classified information, H(x), from the hardness recognition network
and a set of actions, A = {a(1), a(2), ... | a(j) = the set of actuator inputs of the jth
action}. It outputs an action merit vector, M(A), that assigns a value to each action.
The second is a Stochastic Action Selector, SAS, which takes M(A), selects an action
and sends the information to the actuators. According to the action given, the shear
detection network gives an output which can be converted to an immediate payoff
value, r. The RPN is reinforced using TD methods, back-propagating a reinforced
correction vector, RC(n). The simplified algorithm is as follows:

1. H(x) ← current hardness class; for each action a(j), M(a(j)) ← RPN(H(x), a(j));

2. a ← SAS(M(A));

3. Perform action a;

4. Send new sensory information to the hardness recognition network and the shear
detection network; (H(x), S(x)) ← new hardness class and shear value;

5. r = -2S(x) + 1;

6. RC = M(A) + ξ(r), where ξ is a damping constant;

7. Adjust the RPN by back-propagating RC;

8. Go to 1.

There are two ways of implementing the RPN. The Classified RPN is shown in
Figure 4-15. There are only two layers in the network, with an additional neuron
selector at the output. This allows M(A) to converge faster for each class, though when
a new hardness category is added, it has to relearn by adding unattached neurons into
the network and starting from scratch. The other implementation is the Multiple Layer
RPN, which uses more hidden layers and feeds H(x) in with the action vector, as shown in
Figure 4-16. For this system, only synaptic weight adjustment is made for the existing
neurons. The retraining time of this method varies depending on the newly given
object. For this experiment, the Classified RPN is chosen because of its calculation
speed and the limited number of object hardness categories.

Figure 4-15: Classified RPN (back-propagated on the solid lines)

Figure 4-16: Multiple hidden layer RPN (layers labeled hidden 1, hidden 2, ..., hidden k)
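The eight-step loop of the grasping action network can be sketched for a single hardness
class as follows. Several parts are assumptions, not the thesis implementation: the
merit vector is stored directly as a list instead of a network, the Stochastic Action
Selector picks an action with probability proportional to its merit, step 7 is replaced
by moving the chosen merit a fraction `eta` of the way toward RC, and a small positive
floor keeps every merit usable as a selection weight. The symbol `xi` stands for the
damping constant and `shear_of` is a hypothetical stand-in for the shear detection
network.

```python
import random

def select_action(merits, rng):
    """Stochastic Action Selector: probability proportional to each merit (assumed rule)."""
    return rng.choices(range(len(merits)), weights=merits, k=1)[0]

def reinforce(merits, action, shear, eta=0.3, xi=0.3, floor=1e-3):
    """Steps 5 to 7 for the chosen action; returns the payoff r."""
    r = -2.0 * shear + 1.0                 # step 5: shear 1 -> r = -1, shear 0 -> r = +1
    rc = merits[action] + xi * r           # step 6: reinforced correction target
    merits[action] += eta * (rc - merits[action])  # simplified stand-in for step 7
    merits[action] = max(merits[action], floor)    # keep selection weights positive
    return r

def train(shear_of, n_actions=8, trials=200, seed=0, eta=0.3, xi=0.3):
    """Repeat steps 1 to 8 for one hardness class, starting from equal merits."""
    rng = random.Random(seed)
    merits = [1.0] * n_actions
    for _ in range(trials):
        a = select_action(merits, rng)     # step 2
        shear = shear_of(a)                # steps 3 and 4: act, then observe shear
        reinforce(merits, a, shear, eta=eta, xi=xi)
    return merits

# Hypothetical environment in which only action index 5 grasps without slip.
merits = train(lambda a: 0.0 if a == 5 else 1.0)
```

In this sketch an action that causes slip can only lose merit, and an action that holds
the object can only gain merit, so selection gradually concentrates on the stable grasps.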
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Experimental  Results 

A six-set Classified RPN has been constructed with the six categories from the hardness
recognition network. Since each set gives a similar training result, only one class,
H(x) = 3, is shown in this section. The set of actions has been determined to have
eight cases of potential grasping positions. The initial weights are set in such a way that
each action has an equal probability of being chosen at the beginning. Training is a
time-consuming operation since, as mentioned before, the hardware is not functional
enough to operate autonomously: when the grasping action signal is received, external
force needs to be applied to achieve the grasping position, and the slip is detected and
input by the experimenter. For the Classified RPN method, the number of epochs can be quite
small to achieve an optimally trained network. There are two adjustable constants,
the learning rate and the damping constant, which can be changed to train
the network in different ways. The learning graphs with different constant values over a
short period of training are plotted in Figure 4-17, and the long-term training results
are shown in Figure 4-18. When ξ is too small, the network never gets trained as desired
because the system is not reinforced strongly enough. As long as ξ is large, however,
η does not need to be large to learn quickly and correctly. When both ξ and η are too large,
the system falls into a local minimum and does not converge. The advantage of this
system is that once the networks are trained within the desired sum-squared errors, as
long as the damping constant and learning rate are suitably small, the system can
adapt to any new objects that are to be grasped. To simulate the trained network,
the action chosen was output to a computer monitor through a serial port so that
some external force can be applied to achieve the desired action. For a well trained
network, 15 iterations were conducted and it chose one action that can achieve the
stable grasp every time, as shown in Table 4.3. If multiple actions can accomplish the
desired grasp, the output actions are equally divided among them.

Figure 4-18: Training over a long period of time (panel title: Reinforcement Learning
Weight Change Over Time; horizontal axis: number of epochs):
left: ξ = 0.3, η = 0.3;
right: ξ = 1, η = 1.

H(x)    a(n)    Successful Grasping/# of trials total
1       2       15/15
2       8       9/15
        2       6/15
3       6       15/15
4       1       7/15
        2       7/15
5       5       15/15
6       5       14/15

Table 4.3: Stable grasp success rate.
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Chapter  5 
Conclusions 


5.1  Review  of  Thesis 

For this thesis, a self-contained, anthropomorphically scaled, non-task-driven tool which
learns its own cognitive and physical behavior is constructed. The physical challenge
was minimizing the size and weight of a hand which has enough strength and precision
to manipulate objects. Commercially available actuators and sensors are chosen, and
motor and sensor controllers are designed and constructed. The controller boards are
mounted on the dorsum, controlling all the motors and sensors of the hand. When
the whole system was integrated, the overall weight of the hand was less than 1.9
pounds. The arm, which is under construction, is capable of exerting about three
pound torque at the tip of the hand, without the weight of the hand, resulting in one
pound maximum load torque.

The cognitive challenge is more complex because the problem itself is not
well-stated. With our existing technology and biological facts, only a very limited
implementation was made. Utilizing our knowledge of nervous system organization, low level
operation is executed locally at an MC6811 controller, which simulates the spinal
cord. It contains a feedback controller which stabilizes and minimizes the error of
finger positions, and a reflex system for the fingers. The higher level learning schema
is designed, trained and tested in MATLAB and will later be implemented on an
MC68332 controller for autonomy. It learns to distinguish object hardness using a
competitive learning strategy, learns to detect shear using the back-propagation algorithm,
and learns overall simple grasping using reinforcement learning. All the strategies
used are inspired by the human neural system and the human response
to given stimuli, but their implementation is not necessarily a direct model.
This system simulates the surface level learning strategy shown in infants, but may
not coincide with the actual human learning process.

5.2  The  Future 

5.2.1  Physical  Work 

By building a system, many possible improvements become apparent.

STRUCTURE: The whole structure of the hand can be made even smaller. The pieces
are made larger than absolutely necessary for building simplicity. When a part
is smaller, the error ratio becomes higher for the same error caused in machining.
The diameter of the fingers can be cut in half if the pulleys can be machined to fit
the need. Motors can be organized as shown in Figure 5-1 so that the size of
the palm can also be minimized. For more compliance, spring loaded joints for
the proximal joints could be considered, to give another degree of freedom about an axis
perpendicular to the existing rotation at a proximal joint. The weight can be
reduced significantly if the number of screws is reduced by building more
complicated parts instead of bolting two simple pieces together using screws.

SENSORS: Tactile sensor technology needs to take a big leap. Sensors need to be
aligned at the silicon level, giving a high resolution array of force sensors. The
fundamental idea of wires and connectors needs to be improved or changed to
adopt a tactile system that can be integrated in a human form.

Figure 5-1: A diagram of motor alignment improvement (panels: NOW, FUTURE).

COMPUTING: If the designs of the motor controller and sensor interface boards do
not require any change, they may be implemented on a chip containing all the
capabilities needed. This should significantly reduce the size and weight of the
system. Eventually this should be mounted in the spine.

The hand will soon be interfaced with the arm, connecting to the whole body.
With arm manipulation capability and the existence of visual and auditory feedback,
a door will be opened for building a more complex system that triggers many new
constraints and limitations.

Biology has its own amazing system which allows organisms to live and function.
It is the duty of scientists to attempt to decode the organic system for a deeper
understanding of nature.

5.2.2  Cognitive  Work 

Neural networks have allowed scientists to take a big step toward systems that are adaptive and
flexible in an environment which is rapidly changing and full of noise. However,
all the learning theories that are implementable today unfortunately contain many
assumptions that may not be true in the real world. For example, they all assume a
perfect Markov decision world: the complete set of state information can be accessed
by the agent at any time.

Human cognition is still a black box that neuroscientists, philosophers, computer
scientists and many more are required to keep tackling and investigating. As a tiny
step, the attempt to understand infants' manipulation learning by implementing
a physical hand was described in this thesis. Babies may not use a learning mechanism
close to what is described, but when the whole body is integrated, we may discover
phenomena that were not obvious before. Studying the infant learning system seems
to be a suitable starting point since the development of cognition is initiated by the
social interactions and learning that occur during infancy. To get a closer look at
cognition itself, a much simpler physical system with minimal cognitive assumptions
may need to be built to tackle even lower level cognitive problems.

After all, the project to understand human cognition has just started.
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